Extending Static Synchronization Beyond SIMD and VLIW

Author

  • Henry G. Dietz
Abstract

A key advantage of SIMD (Single Instruction stream, Multiple Data stream) architectures is that synchronization is effected statically at compile time, so the execution-time cost of synchronization between “processes” is essentially zero. VLIW (Very Long Instruction Word) machines are successful in large part because they preserve this property while providing more flexibility in the kinds of operations that can be parallelized. In this paper, we propose a new kind of architecture, the “static barrier MIMD” or SBM, which can be viewed as a further generalization of the parallel execution abilities of static synchronization machines. Barrier MIMDs are asynchronous Multiple Instruction stream, Multiple Data stream architectures capable of parallel execution of loops, subprogram calls, and variable-execution-time instructions; however, little or no run-time synchronization is needed. When a group of processors within a barrier MIMD has just encountered a barrier, any conceptual synchronizations between the processors are accomplished statically at zero cost, as in a SIMD or VLIW machine and using similar compiler technology. Unlike in these machines, however, as execution continues the relative timing of processors may become less precisely knowable as a static, compile-time quantity. Where this imprecision becomes too large, the compiler simply inserts a synchronization barrier to ensure that timing imprecision at that point is zero, and again employs purely static, implicit synchronization. Both the architecture and the supporting compiler technology are discussed in detail.
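The barrier-insertion idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the function name `insert_barriers`, the `max_imprecision` threshold, and the model of each instruction as a `(min_cycles, max_cycles)` bound per processor are all assumptions made for illustration. The sketch accumulates each processor's timing uncertainty since the last barrier and places a barrier once the largest uncertainty exceeds the threshold, which resets the uncertainty to zero.

```python
def insert_barriers(steps, max_imprecision):
    """Hypothetical compile-time pass (illustrative only).

    steps: list of steps; each step holds one (min_cycles, max_cycles)
           pair per processor, bounding that processor's execution time.
    max_imprecision: largest timing uncertainty (in cycles) tolerated
           before a barrier is inserted.
    Returns a schedule of ("step", index) and ("barrier",) entries.
    """
    schedule = []
    n_procs = len(steps[0]) if steps else 0
    # Uncertainty accumulated by each processor since the last barrier.
    width = [0] * n_procs
    for index, step in enumerate(steps):
        schedule.append(("step", index))
        # Each instruction widens the interval by (max - min) cycles.
        width = [w + (mx - mn) for w, (mn, mx) in zip(width, step)]
        if max(width, default=0) > max_imprecision:
            # A barrier makes relative timing exactly known again.
            schedule.append(("barrier",))
            width = [0] * n_procs
    return schedule


# Two processors, four steps; step 1 has a variable-time instruction.
example_steps = [
    [(1, 1), (1, 1)],
    [(2, 5), (1, 1)],
    [(1, 1), (3, 4)],
    [(1, 3), (1, 1)],
]
example_schedule = insert_barriers(example_steps, max_imprecision=2)
```

In this toy model, fixed-time instructions (where the minimum equals the maximum) never add uncertainty, so a purely fixed-time program needs no barriers at all, which corresponds to the SIMD and VLIW case described above.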

Similar resources

Initial Evaluation of Multimedia Extensions on VLIW Architectures

Media processing has motivated strong changes in the focus and design of processors. The inclusion of μSIMD multimedia extensions such as MMX is a cost-effective option to improve the performance of those regions of the program with large amounts of DLP. This paper provides an initial evaluation of μSIMD and vector-SIMD enhanced VLIW architectures. We show that these two architectures execute r...

Evaluating Signal Processing and Multimedia Applications on SIMD, VLIW and Superscalar Architectures

This paper aims to provide a quantitative understanding of the performance of DSP and multimedia applications on very long instruction word (VLIW), single instruction multiple data (SIMD), and superscalar processors. We evaluate the performance of the VLIW paradigm using Texas Instruments Inc.’s TMS320C62xx processor and the SIMD paradigm using Intel’s Pentium II processor (with MMX) on a set o...

Evaluating VLIW and SIMD Architectures for DSP and Multimedia Applications

Digital signal processing (DSP) and multimedia applications are expected to be the dominant workloads on future computer systems. In this paper, we evaluate the performance of a very long instruction word (VLIW) processor using Texas Instruments Inc.’s TMS320C6x and a single-instruction multiple-data (SIMD) processor using Intel’s Pentium II processor (with MMX) on a set of benchmarks. Our benc...

Programmable VLIW and SIMD Architectures for DSP and Multimedia Applications

Digital Signal Processing (DSP) and multimedia workloads are expected to be the dominant workloads on future computer systems. This is true both in low-cost embedded applications that use specialized microprocessors like DSPs and in the general-purpose processor market. Very Long Instruction Word (VLIW) architectures have multiple functional units to take advantage of vastly available Instructio...

Logic and Physical Synthesis Methodology for High Performance VLIW/SIMD DSP Core

We describe a logic and physical synthesis methodology to achieve timing closure on a high-end VLIW/SIMD DSP processor core. The design comprises approximately 200,000 placeable instances. The target frequency was 250 MHz in 130 nm technology. The VLIW/SIMD DSP is described using TIE (Tensilica Instruction Extension) language, which is a Verilog-like language for description of...

Publication date: 1998